Parallel Python

Built-in functions

Operators

Throughout the course today we have written several functions that perform simple tasks, such as:

def add(x, y):
    """Function to return the sum of the two arguments"""
    return x + y

def product(x, y):
    """Function to return the product of the two arguments"""
    return x * y

def square(x):
    """Function to return the square of the argument"""
    return x * x

We had to do this because functions such as map expect a function object as their first argument, and an operator symbol like * is not a function. We can't, for example, write:

map(*, a, b)

and expect it to multiply the elements of the two lists together (it is a syntax error). Instead, we had to pass in our own function product, which takes two arguments:

map(product, a, b)

Because using mathematical operations in a functional context is so common, Python already provides functions that implement the operators, in a standard-library module called operator:

In [1]:
import operator

operator.mul(5, 7)
Out[1]:
35

This means that we can use them in our calls to map, ProcessPoolExecutor.map and reduce:

In [2]:
a = [1, 2, 3]
b = [4, 5, 6]

list(map(operator.mul, a, b))
Out[2]:
[4, 10, 18]

In [3]:
list(map(operator.pow, a, b))
Out[3]:
[1, 32, 729]
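One practical reason to prefer the operator functions over a lambda in the parallel case is that ProcessPoolExecutor.map has to pickle the function in order to send it to the worker processes. The sketch below is only an illustration of that point: it calls pickle directly rather than starting a process pool, and the names payload, restored, multiply and lambda_pickles are made up for the example.

```python
import operator
import pickle

# Functions from the operator module are importable by name,
# so they can be pickled and sent to another process.
payload = pickle.dumps(operator.mul)
restored = pickle.loads(payload)
print(restored(5, 7))

# A lambda has no importable name, so the standard pickle module
# refuses to serialise it.
multiply = lambda x, y: x * y
try:
    pickle.dumps(multiply)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False
print(lambda_pickles)
```

With the standard pickle module, the first print should show 35 and the second False, which is why operator.mul is a safer argument than an equivalent lambda when the map runs in a process pool.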

These functions can be used in reductions too:

In [4]:
from functools import reduce

reduce(operator.mul, [1, 2, 3, 4])
Out[4]:
24
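The module covers far more than mul and pow. As a quick, non-exhaustive tour, the sketch below pairs some of the operator functions with the expressions they stand for; the variables x and y and the example lists are arbitrary values chosen for illustration.

```python
import operator

x, y = 7, 2

results = {
    "add": operator.add(x, y),            # x + y
    "sub": operator.sub(x, y),            # x - y
    "mul": operator.mul(x, y),            # x * y
    "truediv": operator.truediv(x, y),    # x / y
    "floordiv": operator.floordiv(x, y),  # x // y
    "mod": operator.mod(x, y),            # x % y
    "pow": operator.pow(x, y),            # x ** y
    "neg": operator.neg(x),               # -x
    "lt": operator.lt(x, y),              # x < y
    "eq": operator.eq(x, y),              # x == y
}

for name, value in results.items():
    print(name, value)

# Any of them can be passed straight to map
differences = list(map(operator.sub, [10, 20, 30], [1, 2, 3]))
print(differences)
```

The full list of functions is in the documentation for the operator module.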

Exercise

Take the answer to the first exercise in the Parallel map/reduce section, where we created countlines.py (solution here), and rewrite the line

total = reduce(lambda x, y: x + y, play_line_count)

to use the correct function from the operator module.

answer

Reductions

Another common thing we did with our results was to reduce them down to a single value by adding them together. We did this with:

In [5]:
reduce(lambda x, y: x + y, [1, 2, 3, 4])
Out[5]:
10

This is such a common thing to do that Python has a built-in function to add together all the numbers in a sequence, sum:

In [6]:
sum([1, 2, 3, 4])
Out[6]:
10
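One difference between sum and a bare reduce is worth knowing about: sum copes with an empty sequence, whereas reduce raises a TypeError unless it is given an initial value. A minimal sketch follows, where the variable names are invented for the example.

```python
from functools import reduce
import operator

empty = []

# sum starts from 0, so an empty sequence is fine
empty_sum = sum(empty)
print(empty_sum)

# reduce has nothing to start from and raises TypeError
try:
    reduce(operator.add, empty)
    reduce_failed = False
except TypeError:
    reduce_failed = True
print(reduce_failed)

# Supplying an initial value as the third argument fixes this
empty_reduce = reduce(operator.add, empty, 0)
print(empty_reduce)

# sum also accepts a starting value as its second argument
offset_sum = sum([1, 2, 3, 4], 100)
print(offset_sum)
```

This matters for a script like countlines.py if it could ever be run with no input files.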

The documentation for this function is in the built-in functions section of the Python documentation. There are a few other built-in reduction functions, such as min, max, any and all. Most reduction functions are simple and there is nothing wrong with writing your own, but if Python already provides one then it is usually worth using it.
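As a brief illustration of those other reductions, the sketch below applies them to a made-up list of line counts (the values in line_counts are hypothetical, not taken from the course data).

```python
# Hypothetical line counts for four files
line_counts = [12, 0, 7, 31]

smallest = min(line_counts)   # smallest value in the sequence
largest = max(line_counts)    # largest value in the sequence

# any is True if at least one element is truthy;
# all is True only if every element is truthy (0 counts as false)
some_nonempty = any(line_counts)
every_nonempty = all(line_counts)

# Both combine well with generator expressions
none_negative = all(n >= 0 for n in line_counts)
some_huge = any(n > 100 for n in line_counts)

print(smallest, largest)
print(some_nonempty, every_nonempty)
print(none_negative, some_huge)
```

Each of these collapses a whole sequence to a single value, just as sum does.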

Exercise

Edit countlines.py again to use the sum function in place of the custom reduction.

answer

Exercise

Using a reduction function from the statistics module, edit countlines.py to also print the average number of lines per file at the end of the program.

answer